GOOD: A Graph Out-of-Distribution Benchmark Supplementary Material

Shurui Gui

Neural Information Processing Systems

GOOD provides 11 datasets with 17 domain selections. For covariate shift splits, given a domain selection, we sort the graphs or nodes by domain and divide the data into a specified number of domains according to the split ratio. For concept shift splits, each graph has a probability, determined by its domain and label, of being included in a given concept. We therefore build each concept by scanning the whole dataset and selecting graphs according to these probabilities. In node classification tasks, we apply the same screening process to nodes instead of graphs.
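The two procedures described above can be illustrated with a minimal Python sketch. This is not the GOOD implementation: the function names `covariate_split` and `build_concept`, the dictionary-based graph records, the `include_prob` lookup table, and the choice to move each cut forward so that no domain is shared between splits are all assumptions made for illustration.

```python
import random


def covariate_split(items, domain_of, ratios=(0.8, 0.1, 0.1)):
    """Sort items by domain, then cut the sorted list at the given ratios.

    Assumption: each cut is moved forward to the next domain boundary so
    that no domain appears in two splits.
    """
    ordered = sorted(items, key=domain_of)
    n = len(ordered)
    splits, start, cumulative = [], 0, 0.0
    for ratio in ratios[:-1]:
        cumulative += ratio
        cut = max(start, round(cumulative * n))
        # Advance the cut while it would fall inside a single domain.
        while 0 < cut < n and domain_of(ordered[cut - 1]) == domain_of(ordered[cut]):
            cut += 1
        splits.append(ordered[start:cut])
        start = cut
    splits.append(ordered[start:])  # the last split takes the remainder
    return splits


def build_concept(items, domain_of, label_of, include_prob, rng):
    """Scan every item once and keep it with its (domain, label) probability.

    include_prob is a hypothetical dict mapping (domain, label) to a
    probability; pairs missing from the dict are never selected.
    """
    selected = []
    for item in items:
        p = include_prob.get((domain_of(item), label_of(item)), 0.0)
        if rng.random() < p:
            selected.append(item)
    return selected


if __name__ == "__main__":
    rng = random.Random(0)
    graphs = [{"domain": rng.randrange(5), "label": rng.randrange(2)} for _ in range(100)]
    train, val, test = covariate_split(graphs, lambda g: g["domain"])
    print(len(train), len(val), len(test))
```

For node classification, the same functions would be applied to node records instead of graph records.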


GOOD: A Graph Out-of-Distribution Benchmark

Shurui Gui, Xiner Li, Limei Wang, Shuiwang Ji

arXiv.org Artificial Intelligence

Out-of-distribution (OOD) learning deals with scenarios in which training and test data follow different distributions. Although general OOD problems have been intensively studied in machine learning, graph OOD is only an emerging area of research. Currently, a systematic benchmark tailored to graph OOD method evaluation is lacking. In this work, we aim to develop an OOD benchmark, known as GOOD, specifically for graphs. We explicitly distinguish between covariate and concept shifts and design data splits that accurately reflect each type of shift. We consider both graph and node prediction tasks, as there are key differences in how shifts are designed for each. Overall, GOOD contains 11 datasets with 17 domain selections. Combining these with covariate, concept, and no shifts yields 51 different splits. We provide performance results for 10 commonly used baseline methods, each with 10 random runs, which gives 510 dataset-model combinations in total. Our results show significant performance gaps between in-distribution and OOD settings. They also shed light on how the performance trends of different methods vary between covariate and concept shifts. Our GOOD benchmark is a growing project and is expected to expand in both quantity and variety of resources as the area develops. The GOOD benchmark can be accessed via https://github.com/divelab/GOOD/.